36 research outputs found
Brief Announcement: The Fault-Tolerant Cluster-Sending Problem
The development of fault-tolerant distributed systems that can tolerate Byzantine behavior has traditionally focused on consensus protocols, which support fully-replicated designs. The development of more sophisticated high-performance Byzantine distributed systems, however, requires more specialized fault-tolerant communication primitives.
In this brief announcement, we identify the cluster-sending problem - the problem of reliably sending a message from one Byzantine cluster to another Byzantine cluster - as such an essential communication primitive. We not only formalize this fundamental problem, but also establish lower bounds on its complexity under crash failures and Byzantine failures. Furthermore, we develop practical cluster-sending protocols that meet these lower bounds and, hence, have optimal complexity. As such, our work provides a strong foundation for the further exploration of novel designs that address challenges encountered in fault-tolerant distributed systems.
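The abstract does not spell out the protocols themselves, but the core receiving-side idea behind reliable cluster-sending can be illustrated with a minimal sketch. Assuming the sending cluster has at most f_sender faulty replicas, a receiver that sees f_sender + 1 identical copies of a message knows at least one copy came from a non-faulty sender. The function name `cluster_receive` and the dictionary-based message model are illustrative, not from the paper:

```python
from collections import Counter

def cluster_receive(copies, f_sender):
    """Accept a value once f_sender + 1 identical copies have arrived.

    `copies` maps sender-replica id -> the value that replica actually sent
    (Byzantine replicas may send anything). With at most f_sender faulty
    senders, f_sender + 1 matching copies guarantee that at least one copy
    came from a non-faulty replica, so the value is the one the sending
    cluster agreed on.
    """
    counts = Counter(copies.values())
    for value, count in counts.items():
        if count >= f_sender + 1:
            return value
    return None  # not enough matching copies: no value can be trusted yet
```

For example, with four senders of which at most one is Byzantine, three matching copies suffice even if the faulty replica sends a forged value. The optimal protocols in the paper additionally minimize *which* and *how many* replicas must send at all; this sketch only shows the acceptance rule.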
Coordination-Free Byzantine Replication with Minimal Communication Costs
State-of-the-art fault-tolerant and federated data management systems rely on fully-replicated designs in which all participants have equivalent roles. Consequently, these systems have only limited scalability and are ill-suited for high-performance data management. As an alternative, we propose a hierarchical design in which a Byzantine cluster manages data, while an arbitrary number of learners can reliably learn the resulting updates and use the corresponding data.
To realize our design, we propose the delayed-replication algorithm, an efficient solution to the Byzantine learner problem that is central to our design. The delayed-replication algorithm is coordination-free, scalable, and has minimal communication cost for all participants involved. As such, the delayed-replication algorithm opens the door to new high-performance fault-tolerant and federated data management systems. To illustrate this, we show that the delayed-replication algorithm is not only useful to support specialized learners, but can also be used to reduce the overall communication cost of permissioned blockchains and to improve their storage scalability.
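One way to see how learning can have minimal communication cost is that the cluster's replicas need not each send the full decided log: each replica can transfer a distinct slice, and the learner checks the reassembled log against a digest the cluster certified. This is a hedged sketch of that general idea, not the paper's actual delayed-replication algorithm; the names `split_log` and `learn` and the SHA-256 digest scheme are assumptions made for illustration:

```python
import hashlib

def split_log(log, n):
    """Partition the decided log into n chunks, one per cluster replica,
    so each replica transfers only ~1/n of the data."""
    k = (len(log) + n - 1) // n
    return [log[i * k:(i + 1) * k] for i in range(n)]

def learn(chunks, certified_digest):
    """Reassemble the chunks received from the replicas and verify them
    against a digest the cluster certified. A Byzantine replica that sends a
    wrong chunk is detected because the digest mismatches; the learner can
    then re-request that chunk from another replica."""
    log = [entry for chunk in chunks for entry in chunk]
    digest = hashlib.sha256("|".join(log).encode()).hexdigest()
    return log if digest == certified_digest else None
```

No coordination among replicas is needed to answer a learner: each replica serves its slice independently, which is what makes the approach scalable to an arbitrary number of learners.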
Brief Announcement: Revisiting Consensus Protocols through Wait-Free Parallelization
In this brief announcement, we propose a protocol-agnostic approach to improve the design of primary-backup consensus protocols. At the core of our approach is a novel wait-free design of running several instances of the underlying consensus protocol in parallel. To yield a high-performance parallelized design, we present coordination-free techniques to order operations across parallel instances, deal with instance failures, and assign clients to specific instances. Consequently, the design we present is able to reduce the load on individual instances and primaries, while also reducing the adverse effects of any malicious replicas. Our design is fine-tuned such that the instances coordinated by non-faulty replicas are wait-free: they can continuously make consensus decisions, independent of the behavior of any other instances.
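The simplest coordination-free way to order operations across parallel instances is positional: if there are m instances, the r-th decision of instance i can be placed at global position r * m + i, so each instance only needs to know its own index and m. This is a sketch of that ordering rule under those assumptions, not the paper's full technique (which must also handle failed or slow instances):

```python
def global_order(decisions_per_instance):
    """Merge the decision streams of m parallel consensus instances into a
    single global order without cross-instance coordination: round r of
    instance i lands at global position r * m + i."""
    m = len(decisions_per_instance)
    rounds = min(len(stream) for stream in decisions_per_instance)
    ordered = []
    for r in range(rounds):
        for i in range(m):
            ordered.append(decisions_per_instance[i][r])
    return ordered
```

Note that this rule makes the global order advance only as fast as the slowest instance's decided prefix; the wait-free property in the abstract concerns each instance's ability to keep *deciding* independently, not the merged order.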
Stab-Forests: Dynamic Data Structures for Efficient Temporal Query Processing
Many sources of data have temporal start and end attributes or are created in a time-ordered manner. Hence, it is only natural to consider joining datasets based on these temporal attributes. To do so efficiently, several internal-memory temporal join algorithms have recently been proposed. Unfortunately, these join algorithms are designed to join entire datasets and cannot efficiently join skewed datasets in which only a few events participate in the join result.
To support high-performance internal-memory temporal joins of skewed datasets, we propose the skip-join algorithm, which operates on stab-forests. The stab-forest is a novel dynamic data structure for indexing temporal data that allows efficient updates when events are appended in a time-based order. Our stab-forests efficiently support not only traditional temporal stab-queries, but also more general multi-stab-queries. We conducted an experimental evaluation to compare the skip-join algorithm with state-of-the-art techniques using real-world datasets. We observed that the skip-join algorithm outperforms other techniques by an order of magnitude when joining skewed datasets and delivers comparable performance to other techniques on non-skewed datasets.
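For readers unfamiliar with the terminology: a stab-query at time t asks for all events whose interval [start, end] contains t, and a multi-stab-query answers several such queries at once. The following minimal sketch shows the query semantics on an append-in-time-order event list; it is not the stab-forest data structure itself, which shares work across query times instead of scanning:

```python
import bisect

def stab(events, t):
    """Return all events (start, end) whose interval contains time t.
    Events are kept sorted by start time, matching the append-in-time-order
    setting; bisect skips every event that starts after t."""
    i = bisect.bisect_right(events, (t, float("inf")))
    return [(s, e) for (s, e) in events[:i] if e >= t]

def multi_stab(events, times):
    """A multi-stab-query answers several stab-queries at once. A real
    stab-forest amortizes work across the query times; this sketch does not."""
    return {t: stab(events, t) for t in times}
```

For example, `stab([(1, 4), (2, 9), (5, 7), (8, 8)], 5)` returns `[(2, 9), (5, 7)]`, the two events whose intervals contain time 5.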
Practical View-Change-Less Protocol through Rapid View Synchronization
The emergence of blockchain technology has renewed interest in
consensus-based data management systems that are resilient to failures. To
maximize throughput of these systems, we have recently seen several prototype
consensus solutions that optimize for throughput at the expense of increased
implementation complexity, higher costs, and reduced reliability. Due to this, it remains
unclear how these prototypes will perform in real-world environments. In this
paper, we present the Practical View-Change-Less Protocol PVP, a
high-throughput, simple, and reliable consensus protocol. Central to PVP is the
combination of (1) a chained consensus design for replicating requests with a
reduced message cost; (2) the novel Rapid View Synchronization protocol that
enables robust and low-cost failure recovery; and (3) a high-performance
concurrent consensus architecture in which independent instances of the chained
consensus operate concurrently to process requests with high throughput and
without single-replica bottlenecks. Due to the concurrent consensus
architecture, PVP greatly outperforms traditional primary-backup consensus
protocols such as PBFT (by up to 430%), Narwhal (by up to 296%), and HotStuff
(by up to 3803%). Due to its reduced message cost, PVP is even able to
outperform RCC, a state-of-the-art high-throughput concurrent consensus
protocol, by up to 23%. Furthermore, PVP is able to maintain a stable and low
latency and consistently high throughput even during failures.
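The abstract gives no protocol details beyond the names of PVP's components, but the general principle behind any chained consensus design can be sketched: each proposal embeds the hash of its parent, so a single vote on a block implicitly re-affirms its whole ancestry, which is what reduces message cost relative to deciding every request separately. The block layout below is a generic illustration of chaining, not PVP's actual message format:

```python
import hashlib

def block_hash(block):
    """Deterministic digest of a block (dicts preserve insertion order)."""
    return hashlib.sha256(repr(block).encode()).hexdigest()

def make_block(parent, payload):
    """Chain a new proposal onto its parent by embedding the parent's hash.
    Voting for this block then transitively endorses every ancestor, so one
    round of votes can advance the whole chain."""
    return {"parent": block_hash(parent) if parent else None,
            "payload": payload}
```

Tampering with any ancestor changes its hash and breaks the chain, which is also what makes failure recovery (the job of view synchronization) safe: replicas can agree on the highest intact chain prefix.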
I/O efficient bisimulation partitioning on very large directed acyclic graphs
In this paper we introduce the first efficient external-memory algorithm to
compute the bisimilarity equivalence classes of a directed acyclic graph (DAG).
DAGs are commonly used to model data in a wide variety of practical
applications, ranging from XML documents and data provenance models, to web
taxonomies and scientific workflows. In the study of efficient reasoning over
massive graphs, the notion of node bisimilarity plays a central role. For
example, grouping together bisimilar nodes in an XML data set is the first step
in many sophisticated approaches to building indexing data structures for
efficient XPath query evaluation. To date, however, only internal-memory
bisimulation algorithms have been investigated. As the size of real-world DAG
data sets often exceeds available main memory, storage in external memory
becomes necessary. Hence, there is a practical need for an efficient approach
to computing bisimulation in external memory.
Our general algorithm has a worst-case I/O-complexity of O(Sort(|N| + |E|)),
where |N| and |E| are, respectively, the numbers of nodes and edges in the data graph
and Sort(n) is the number of accesses to external memory needed to sort an
input of size n. We also study specializations of this algorithm to common
variations of bisimulation for tree-structured XML data sets. We empirically
verify efficient performance of the algorithms on graphs and XML documents
having billions of nodes and edges, and find that the algorithms can process
such graphs efficiently even when very limited internal memory is available.
The proposed algorithms are simple enough for practical implementation and use,
and open the door for further study of external-memory bisimulation algorithms.
To this end, the full open-source C++ implementation has been made freely
available.
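To make the computed notion concrete: on a DAG, two nodes are bisimilar exactly when they carry the same label and their children fall into the same set of bisimulation classes, which can be computed bottom-up. The following internal-memory sketch shows that definition directly (the paper's contribution is performing this computation with O(Sort(|N| + |E|)) I/Os in external memory, which this sketch does not attempt); note it uses recursion and so is only suitable for small inputs:

```python
def bisimulation_classes(labels, children):
    """Assign each DAG node its bisimulation equivalence class, bottom-up:
    a node's signature is its label plus the set of its children's classes,
    and identical signatures map to the same class id."""
    classes, table = {}, {}

    def cls(v):
        if v not in classes:
            sig = (labels[v], frozenset(cls(c) for c in children.get(v, [])))
            # setdefault hands out a fresh id (current table size) per signature
            classes[v] = table.setdefault(sig, len(table))
        return classes[v]

    for v in labels:
        cls(v)
    return classes
```

On a diamond-shaped DAG where two middle nodes have the same label and the same child, both land in the same class, which is the grouping step that, e.g., XPath indexing structures build on.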